Authors: Ziyue Xiao (ziyuex), Minxue Gu (minxue)

Obesity in Los Angeles

Location of fast food restaurants (McDonald’s, Burger King and Jack in the Box) in Los Angeles

Based on the locations of fast food restaurants, we . We also made a buffer with a range of 1500 miles to indicate whether

Income and race distribution

Equity analysis

Obesity by income level and race

In this graph, we examine differences in obesity rates among racial and income groups. The overall obesity rate decreases as income increases, which suggests that income is related to obesity. The relatively low price of fast food may be a driving factor, since it may affect the food choices of people with different incomes. Obesity rates also differ among racial groups at the same income level, which may reflect different tendencies toward fast food among groups in Los Angeles. These differences shrink as overall income rises, indicating that income may play the more important role in the obesity rate.

Obesity by race and access to fast food

In this graph, we examine differences in obesity rates by race and by whether people live near or far from fast food restaurants. The obesity rate of people who live close to fast food restaurants is generally higher than that of people who live far from them. This may mean that the convenience of fast food restaurants has an effect on obesity: people may choose fast food because it is a convenient and fast way to get food. It is worth noting that the gap was largest for Black and White residents and smaller for Asian residents.

Obesity by income level and access to fast food

In this graph, we examine differences in obesity rates by income group and by whether people live near or far from fast food restaurants. In addition to the relationship between income and obesity rate described earlier, we can see that in the high-income group the obesity rate of people near fast food restaurants is higher than that of people far from them. In the low-income group the difference is less obvious, or even reversed, which may indicate that the obesity rate of low-income people is related to other factors.

Regression

Analyze the association between fast food convenience and obesity rates

## 
## Call:
## lm(formula = obesityRate ~ hasAccessToFastFood, data = obesity_access)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -13.1228  -4.2228  -0.4228   4.1772  22.8772 
## 
## Coefficients:
##                     Estimate Std. Error t value Pr(>|t|)    
## (Intercept)          26.2228     0.1313  199.68   <2e-16 ***
## hasAccessToFastFood   2.6763     0.2337   11.45   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.236 on 2322 degrees of freedom
## Multiple R-squared:  0.05347,    Adjusted R-squared:  0.05306 
## F-statistic: 131.2 on 1 and 2322 DF,  p-value: < 2.2e-16

This analysis compares the obesity rates of people who do not have access to fast food restaurants (hasAccessToFastFood = 0) with those who do (hasAccessToFastFood = 1). From the summary, the regression coefficient is 2.6763. Because the predictor is binary, this slope is the difference in mean obesity rate between the two groups: about 2.7 percentage points higher where fast food is accessible. The difference is statistically significant, although access alone explains only about 5% of the variation in obesity rates (R-squared = 0.053). This is consistent with the association between proximity to fast food restaurants and obesity rates seen in the equity analysis.
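
The claim that the slope on a 0/1 predictor equals the difference in group means can be checked on a small example. The sketch below uses made-up numbers (the `toy` data frame is hypothetical and is not the project's `obesity_access` table); only the column names are borrowed from the model above.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  hasAccessToFastFood = c(0, 0, 0, 0, 1, 1, 1, 1),
  obesityRate         = c(24.0, 26.5, 27.0, 25.5, 28.0, 30.5, 29.0, 27.5)
)

fit <- lm(obesityRate ~ hasAccessToFastFood, data = toy)

# Group means computed directly from the data
mean_no_access <- mean(toy$obesityRate[toy$hasAccessToFastFood == 0])
mean_access    <- mean(toy$obesityRate[toy$hasAccessToFastFood == 1])

# With a binary predictor, the intercept is the mean of the 0 group and
# the slope is the difference between the two group means.
intercept <- unname(coef(fit)["(Intercept)"])
slope     <- unname(coef(fit)["hasAccessToFastFood"])

print(all.equal(intercept, mean_no_access))
print(all.equal(slope, mean_access - mean_no_access))
```

Read the same way, the reported output implies a mean obesity rate of about 26.2% without access and about 26.2228 + 2.6763 = 28.9% with access.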

Analyze the association between household income and obesity rates

## 
## Call:
## lm(formula = obesityRate ~ householdIncome, data = obesity_access_race)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -15.871  -2.924   0.044   3.026  18.381 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      3.177e+01  8.639e-02  367.72   <2e-16 ***
## householdIncome -6.580e-05  1.030e-06  -63.86   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.428 on 11458 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.2625, Adjusted R-squared:  0.2624 
## F-statistic:  4078 on 1 and 11458 DF,  p-value: < 2.2e-16

This graph plots the relationship between income and obesity rate; we use ‘loess’ instead of ‘lm’ to draw a smooth curve. From the summary, each additional dollar of household income is associated with an obesity rate that is 6.580e-05 percentage points lower, or about 0.66 percentage points per 10,000 dollars. The p-value is very small (< 2e-16), so the result is statistically significant, and income explains about 26% of the variation in obesity rates.
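
Because the per-dollar coefficient is so small, it can be easier to read after rescaling. The following sketch simply copies the two coefficients from the summary above and evaluates the fitted line at two example incomes; the helper name `predict_rate` and the income values are chosen here for illustration.

```r
# Coefficients copied from the regression summary above
intercept     <- 3.177e+01    # fitted obesity rate (percent) at zero income
slope_per_usd <- -6.580e-05   # percentage points per additional dollar

# Rescale the slope to a more readable unit: per 10,000 dollars of income
slope_per_10k <- slope_per_usd * 10000

# Hypothetical helper: fitted obesity rate (percent) for a given income
predict_rate <- function(income) intercept + slope_per_usd * income

rate_50k  <- predict_rate(50000)
rate_100k <- predict_rate(100000)

print(slope_per_10k)
print(c(rate_50k, rate_100k))
```

Under this linear fit, the obesity rate is about 28.5% at an income of 50,000 and about 25.2% at 100,000. Since the loess curve is not constrained to be a straight line, these values are only a linear approximation of the plotted relationship.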

Analyze the association between race and obesity rates

## 
## Call:
## lm(formula = obesityRate ~ Race, data = obesity_access_race)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -13.610  -3.885  -0.323   3.791  23.363 
## 
## Coefficients:
##                        Estimate Std. Error t value Pr(>|t|)    
## (Intercept)             25.1852     0.1272 198.016  < 2e-16 ***
## RaceBlack                3.2378     0.2018  16.047  < 2e-16 ***
## RaceHispanic             2.2243     0.1673  13.294  < 2e-16 ***
## RaceNative American      3.4700     0.6271   5.533 3.22e-08 ***
## RaceOther                3.3037     0.1760  18.775  < 2e-16 ***
## RacePacific Islander     2.9421     1.5209   1.934   0.0531 .  
## RaceTwo Or More          0.8496     0.2179   3.899 9.72e-05 ***
## RaceWhite                1.8521     0.1651  11.221  < 2e-16 ***
## RaceWhite Non-Hispanic   0.5516     0.1727   3.195   0.0014 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 5.027 on 11451 degrees of freedom
##   (6 observations deleted due to missingness)
## Multiple R-squared:  0.04998,    Adjusted R-squared:  0.04931 
## F-statistic:  75.3 on 8 and 11451 DF,  p-value: < 2.2e-16
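
The output above has one coefficient fewer than there are race categories because R's default treatment coding uses the first factor level as the baseline. A minimal sketch with invented data (the `toy` data frame and its values are hypothetical) shows how the baseline is chosen and how `relevel()` changes it:

```r
# Toy data, invented for illustration only
toy <- data.frame(
  Race        = c("Asian", "Asian", "Black", "Black", "White", "White"),
  obesityRate = c(24, 26, 28, 30, 26, 28)
)
toy$Race <- factor(toy$Race)

# The first level (alphabetical by default) becomes the baseline
baseline <- levels(toy$Race)[1]

# Intercept = mean of the baseline group; each other coefficient is that
# group's mean minus the baseline mean.
fit   <- lm(obesityRate ~ Race, data = toy)
coefs <- coef(fit)

# Choosing a different baseline changes the coefficients but not the fit
toy$Race2 <- relevel(toy$Race, ref = "White")
fit2      <- lm(obesityRate ~ Race2, data = toy)
coefs2    <- coef(fit2)

print(baseline)
print(coefs)
print(coefs2)
```

In this toy example the group means are 25, 29 and 27, so the first model reports differences from Asian (4 and 2) and the second reports differences from White (-2 and 2).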

Asian, the first category alphabetically, was taken as the “baseline” against which the other categories were compared. Each regression coefficient can be interpreted as the difference in obesity rate relative to Asians. Every coefficient is positive, so the estimated obesity rate of each other group is higher than that of Asians. The gap is smallest for White Non-Hispanic (0.55) and Two Or More (0.85) and largest for Native American (3.47), Other (3.30) and Black (3.24). All of these differences are statistically significant at the 5% level except for Pacific Islander (p = 0.0531). Overall, however, racial differences explain only about 5 percent of the variation in obesity rates.

Multi-regression

Analyze the association between obesity and neighborhood income, ethnic composition and working hours in fast-food supply areas.

## 
## Call:
## lm(formula = obesityRate ~ householdIncome + race + WKHP, data = obesity_access_race_True)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -14.2012  -3.2199  -0.1664   3.4465  19.2382 
## 
## Coefficients:
##                   Estimate Std. Error   t value Pr(>|t|)    
## (Intercept)      3.246e+01  7.353e-03  4414.421   <2e-16 ***
## householdIncome -7.475e-05  6.081e-08 -1229.244   <2e-16 ***
## raceblack        1.571e+00  8.201e-03   191.496   <2e-16 ***
## racehispanic     7.176e-01  7.179e-03    99.965   <2e-16 ***
## raceother        6.890e-01  6.349e-03   108.519   <2e-16 ***
## racewhite        8.068e-01  7.073e-03   114.061   <2e-16 ***
## WKHP             3.063e-05  9.206e-05     0.333    0.739    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 4.522 on 5486548 degrees of freedom
## Multiple R-squared:  0.2401, Adjusted R-squared:  0.2401 
## F-statistic: 2.889e+05 on 6 and 5486548 DF,  p-value: < 2.2e-16

The model examines the relationship between obesity rates and income, race and hours worked (WKHP) in the presence of nearby fast food restaurants. We expected that long working hours would compress leisure time and push people toward fast food where it is available. However, the coefficient on WKHP is very small (3.063e-05) and not statistically significant (p = 0.739), so with income and race held constant the model gives no evidence that working hours are associated with obesity rates. Each additional dollar of household income is associated with an obesity rate that is 7.475e-05 percentage points lower (about 0.75 percentage points per 10,000 dollars), a slightly larger magnitude than in the simple regression (6.580e-05). The race coefficients are all positive, as before, with the largest for Black (1.57), although they are smaller than in the race-only model.

Predictive Model

Build a predictive model in which obesity risk is the response variable and neighborhood income and ethnic composition are the predictors.

## 
## Call:
## glm(formula = obesity_risk_population ~ race + householdIncome, 
##     family = quasibinomial(), data = predict)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -1.8196  -0.8163  -0.4309   0.9548   2.1414  
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)      1.556e+00  6.477e-02   24.02   <2e-16 ***
## race            -1.060e-02  7.627e-03   -1.39    0.165    
## householdIncome -4.069e-05  9.949e-07  -40.90   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for quasibinomial family taken to be 0.8523689)
## 
##     Null deviance: 13326  on 11465  degrees of freedom
## Residual deviance: 11001  on 11463  degrees of freedom
## AIC: NA
## 
## Number of Fisher Scoring iterations: 5

Here we create a variable, obesity_risk_population (obesity risk), that is 1 if the household is close to a fast food restaurant and has a household income of less than 90,000, and 0 otherwise.
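
The rule for this variable can be written as a single logical condition. The sketch below applies it to a small invented data frame (`toy` and its four rows are hypothetical); the column names follow the models above, and it is an assumption that the original code was written this way.

```r
# Toy data, invented for illustration only
toy <- data.frame(
  hasAccessToFastFood = c(1, 1, 0, 0),
  householdIncome     = c(50000, 120000, 50000, 120000)
)

# 1 only when both conditions hold: access to fast food AND income below 90,000
toy$obesity_risk_population <- ifelse(
  toy$hasAccessToFastFood == 1 & toy$householdIncome < 90000, 1, 0
)

print(toy)
```

Only the first row satisfies both conditions, so it is the only one flagged as at risk.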

## # A tibble: 1 × 10
##   GEOID       Race        householdIncome householdIncome… obesityRate hasAccessToFast…
##   <chr>       <chr>                 <dbl>            <dbl>       <dbl>            <dbl>
## 1 06037295103 Two Or More          170871            50319        22.9                0
## # … with 4 more variables: TotalPopulation <dbl>, income <chr>, race <dbl>,
## #   obesity_risk_population <dbl>
##           1 
## 0.004189577
##    
## .   FALSE TRUE
##   0  2889 5506
##   1     0 3071

The bottom-right cell (3071) is the number of households at risk for obesity, meaning an income of less than 90,000 and access to fast food, that the model correctly predicted to be at risk using income and race as predictors. The top-left cell (2889) is the number of households we assumed not to be at risk of obesity for which the model predicted the same result. So 5960 of 11466 records, about 52%, were correctly predicted one way or the other. The top-right cell (5506) is the number of households we assumed not to be at risk of obesity but that the model incorrectly predicted to be at risk; these are “false positives”. The bottom-left cell (0) is the number of households that are actually at risk of obesity but that the model incorrectly predicted to be safe; these are “false negatives”.
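
The share of correct predictions can be recomputed directly from the four counts in the table. The sketch below re-enters those counts by hand, reading rows as the assumed class and columns as the model's prediction, as in the description above; the object names are chosen here for illustration.

```r
# Counts copied from the table above
# rows: assumed class (0 = not at risk, 1 = at risk)
# columns: model prediction (FALSE = not at risk, TRUE = at risk)
conf <- matrix(c(2889, 5506,
                    0, 3071),
               nrow = 2, byrow = TRUE,
               dimnames = list(actual    = c("0", "1"),
                               predicted = c("FALSE", "TRUE")))

total       <- sum(conf)
correct     <- conf["0", "FALSE"] + conf["1", "TRUE"]
accuracy    <- correct / total
false_pos   <- conf["0", "TRUE"]
false_neg   <- conf["1", "FALSE"]
sensitivity <- conf["1", "TRUE"] / sum(conf["1", ])    # at-risk rows caught
specificity <- conf["0", "FALSE"] / sum(conf["0", ])   # not-at-risk rows cleared

print(round(c(accuracy = accuracy,
              sensitivity = sensitivity,
              specificity = specificity), 3))
```

With these counts the model never misses an at-risk household, but it flags roughly two thirds of the households that are not at risk, which is what holds the overall accuracy near 52%.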